Skip to content

Conversation

@kramaranya
Copy link
Contributor

@kramaranya kramaranya commented Aug 1, 2025

What this PR does / why we need it:
I added a test-python-dev target and changed the unit test CI jobs with a dev/prod matrix

Dev leg installs the dev dependency group, which now pulls kubeflow_trainer_api from the master branch. Prod leg will run with the packages published to PyPI (once API models are available there)

I also updated the CI job to run unit tests for two Python vesions -- 3.9 and 3.11

I removed flake8 since ruff replaced it

/assign @andreyvelich @astefanutti @Electronic-Waste @tenzen-y @briangallagher

Checklist:

  • Docs included if any changes are user facing

@google-oss-prow
Copy link

@kramaranya: GitHub didn't allow me to assign the following users: briangallagher.

Note that only kubeflow members with read permissions, repo collaborators and people who have commented on this issue/PR can be assigned. Additionally, issues/PRs can only have 10 assignees at the same time.
For more information please see the contributor guide

In response to this:

What this PR does / why we need it:
I added a test-python-dev target and changed the unit test CI jobs into a dev/prod matrix

Dev leg installs the dev dependency group, which now pulls kubeflow_trainer_api from the master branch. Prod leg will run with the packages published to PyPI (once API models are available there)

/assign @andreyvelich @astefanutti @Electronic-Waste @tenzen-y @briangallagher

Checklist:

  • Docs included if any changes are user facing

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@coveralls
Copy link

coveralls commented Aug 1, 2025

Pull Request Test Coverage Report for Build 16750375247

Details

  • 0 of 0 changed or added relevant lines in 0 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage remained the same at 66.038%

Totals Coverage Status
Change from base Build 16727070395: 0.0%
Covered Lines: 280
Relevant Lines: 424

💛 - Coveralls

Copy link
Member

@Electronic-Waste Electronic-Waste left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kramaranya Thanks for this!
/lgtm

@google-oss-prow google-oss-prow bot added the lgtm label Aug 1, 2025
@astefanutti
Copy link
Contributor

/lgtm

That's great, thanks!

@kramaranya
Copy link
Contributor Author

@astefanutti @Electronic-Waste thanks for your review!
I’ve updated the CI to run unit tests on Python 3.9 and 3.11, and removed flake8 because ruff replaced it in #38
Could you please take a look again?

Copy link
Member

@andreyvelich andreyvelich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for this @kramaranya!

[dependency-groups]
dev = [
"ruff>=0.12.2",
"kubeflow_trainer_api@git+https://github.com/kubeflow/trainer.git@master#subdirectory=api/python_api",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any use-case when users would like to install Kubeflow SDK with the latest module package ?
IIUC, the dependency-groups is never published to PyPI by hatch, so users can't install.
Thoughts @astefanutti @szaher @briangallagher @kramaranya @eoinfennessy ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I've been thinking about adding this to project.optional-dependencies as well, but since we don't update API models that often, it makes sense to me to keep models from master available only for testing. However, I'm open to other suggestions if you see it differently

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From my point of view, since we allow to install control plane with the latest changes, users might want to experiment with it during the development: https://www.kubeflow.org/docs/components/trainer/operator-guides/installation/#installing-the-kubeflow-trainer-controller-manager

This gives opportunities for them to quickly adopt newer API changes into their systems.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense, let me update it

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated in 4d8967f

Makefile Outdated
Comment on lines 70 to 79
.PHONY: test-python-dev
test-python-dev: uv-venv
@uv pip install --project $(PY_DIR) --group dev "$(PY_DIR)[test]"
@uv run coverage run --source=kubeflow.trainer.api.trainer_client,kubeflow.trainer.utils.utils -m pytest ./python/kubeflow/trainer/api/trainer_client_test.py
@uv run coverage report -m kubeflow/trainer/api/trainer_client.py kubeflow/trainer/utils/utils.py
ifeq ($(report),xml)
@uv run coverage xml
else
@uv run coverage html
endif
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we run tests over the master modules for now ?
Since we can't introduce any breaking changes to the Kubernetes API, I don't see a need to run tests against released modules.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we can't introduce any breaking changes to the Kubernetes API

Hmm, why can't we?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

According to Kubernetes API best practices, if we introduce any breaking changes to the existing API: v1alpha1, we have to create the newer API version, for instance: v1alpha2: https://kubernetes.io/docs/reference/using-api/deprecation-policy/

cc @tenzen-y @astefanutti

In the future, we should discuss how to design our SDK to make it compatible with supported API versions, but we are not there yet.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot for explanation! That makes sense now to test it only against master branch
Updated in 4d8967f

Comment on lines +8 to +10
concurrency:
group: ${{ github.workflow }}-${{ github.ref }}
cancel-in-progress: true
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIRC, we discussed before that this part is not needed given that we don't have limitations on concurrent runs due to GitHub enterprise.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A use case is re-pushing the branch for the PR while the workflow is running. Even with no limitations, is that useful to keep the existing run going?

Copy link
Member

@andreyvelich andreyvelich Aug 4, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be useful to compare results, but I am fine to cancel them for now.
@kramaranya Please can you apply the change to all GitHub action tests in that case ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, updated in 4853da4

flag-name: ${{ matrix.dependency-group.name }}-py${{ matrix.python-version }}
parallel: true

coverage:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't we run coverage as part of the previous job in Upload coverage to Coveralls step ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I can see, it shows the report on the PRs: #56 (comment)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I added the parallel + “coverage-finished” step there when the matrix had four legs (dev/prod × two Python versions), so Coveralls could wait for every flag upload before closing the build and merging reports -- https://docs.coveralls.io/parallel-builds

But I've removed it in 189139d since we're now testing against the master modules only

@kramaranya
Copy link
Contributor Author

/test all

@google-oss-prow
Copy link

@kramaranya: No presubmit jobs available for kubeflow/sdk@main

In response to this:

/test all

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Signed-off-by: kramaranya <kramaranya15@gmail.com>
Signed-off-by: kramaranya <kramaranya15@gmail.com>
Signed-off-by: kramaranya <kramaranya15@gmail.com>
Signed-off-by: kramaranya <kramaranya15@gmail.com>
Signed-off-by: kramaranya <kramaranya15@gmail.com>
Signed-off-by: kramaranya <kramaranya15@gmail.com>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's also migrate to the official package as part of this PR: https://pypi.org/project/kubeflow-trainer-api/

Signed-off-by: kramaranya <kramaranya15@gmail.com>
Signed-off-by: kramaranya <kramaranya15@gmail.com>
Signed-off-by: kramaranya <kramaranya15@gmail.com>
Signed-off-by: kramaranya <kramaranya15@gmail.com>
Signed-off-by: kramaranya <kramaranya15@gmail.com>
@kramaranya
Copy link
Contributor Author

@andreyvelich @astefanutti thank you for the review!
I should have addressed all your comments and imported kubeflow-trainer-api from PyPI

"kubeflow-trainer-api>=2.0.0",
.
Please let me know if there's anything else I should update!

@astefanutti
Copy link
Contributor

/lgtm

Thanks @kramaranya!

@google-oss-prow google-oss-prow bot added the lgtm label Aug 5, 2025
CONTRIBUTING.md Outdated

Install development tools:
```sh
pip install pytest black isort flake8 coverage pre-commit
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess, we can remove it since we have everything in dev, right ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yep, updated in cb1c4df

Makefile Outdated
.PHONY: test-python
test-python: uv-venv
@uv pip install "./python[test]"
@uv pip install --project $(PY_DIR) --group dev "$(PY_DIR)[test,dev]"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please can we move ruff and test packages to dev too ?

In that case we have all tools required for development in one place

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, shall we also add pre-commit package to dev?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, sure!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

great, updated in cb1c4df!

Signed-off-by: kramaranya <kramaranya15@gmail.com>
@kramaranya
Copy link
Contributor Author

@andreyvelich thank you for the review! Let me please know if there's anything else I should update!

Makefile Outdated
.PHONY: test-python
test-python: uv-venv
@uv pip install "./python[test]"
@uv pip install --project $(PY_DIR) -e "$(PY_DIR)[dev]"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need this --project flag ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added it for project isolation, but since we have only one project right now, I can remove it

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done in 542ddcd

Signed-off-by: kramaranya <kramaranya15@gmail.com>
Copy link
Member

@andreyvelich andreyvelich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this @kramaranya!
/lgtm
/approve

@google-oss-prow google-oss-prow bot added the lgtm label Aug 5, 2025
@google-oss-prow
Copy link

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: andreyvelich

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@google-oss-prow google-oss-prow bot merged commit 7614d74 into kubeflow:main Aug 5, 2025
9 checks passed
@google-oss-prow google-oss-prow bot added this to the v0.1 milestone Aug 5, 2025
@kramaranya kramaranya deleted the dev-testing branch August 8, 2025 16:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants